ChatCLIDS: Simulating Persuasive AI Dialogues to Promote Closed-Loop Insulin Adoption in Type 1 Diabetes Care
Yao, Zonghai; Chafekar, Talha; Wang, Junda; Han, Shuo; Ouyang, Feiyun; Qian, Junhui; Li, Lingxi; Yu, Hong
Real-world adoption of closed-loop insulin delivery systems (CLIDS) in type 1 diabetes remains low, driven not by technical failure, but by diverse behavioral, psychosocial, and social barriers. We introduce ChatCLIDS, the first benchmark to rigorously evaluate LLM-driven persuasive dialogue for health behavior change. Our framework features a library of expert-validated virtual patients, each with clinically grounded, heterogeneous profiles and realistic adoption barriers, and simulates multi-turn interactions with nurse agents equipped with a diverse set of evidence-based persuasive strategies. ChatCLIDS uniquely supports longitudinal counseling and adversarial social influence scenarios, enabling robust, multi-dimensional evaluation. Our findings reveal that while larger and more reflective LLMs adapt strategies over time, all models struggle to overcome resistance, especially under realistic social pressure. These results highlight critical limitations of current LLMs for behavior change, and offer a high-fidelity, scalable testbed for advancing trustworthy persuasive AI in healthcare and beyond.
From Internal Conflict to Contextual Adaptation of Language Models
Marjanović, Sara Vera; Yu, Haeun; Atanasova, Pepa; Maistro, Maria; Lioma, Christina; Augenstein, Isabelle
Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge. Nevertheless, studies indicate that LMs often ignore the provided context when it conflicts with the memory the LM acquired during pre-training. Moreover, conflicting knowledge can already be present within the LM's parameters, termed intra-memory conflict. Existing work has studied these two types of knowledge conflict only in isolation. We conjecture that the degree of intra-memory conflict can in turn affect how an LM handles context-memory conflicts. To study this, we introduce the DYNAMICQA dataset, which includes temporally dynamic facts, which change with varying frequency over time, and disputable facts, which can change depending on the viewpoint. DYNAMICQA is the first dataset to include real-world knowledge conflicts and to provide context for studying the link between the different types of knowledge conflict. With the proposed dataset, we assess the use of uncertainty for measuring intra-memory conflict and introduce a novel Coherent Persuasion (CP) score to evaluate a context's ability to sway the LM's semantic output. Our extensive experiments reveal that static facts, which are unlikely to change, are more easily updated with additional context than temporal and disputable facts.
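The abstract's use of uncertainty to measure intra-memory conflict can be illustrated with a minimal sketch. The exact DYNAMICQA formulation is not reproduced here; as an assumed stand-in, Shannon entropy over a model's answer distribution serves as the uncertainty signal (the function and variable names are illustrative, not from the paper):

```python
import math

def entropy(dist):
    """Shannon entropy (in nats) of a discrete answer distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

# A model torn between two contradictory parametric answers
# (intra-memory conflict) yields a flatter, higher-entropy
# distribution than a model confident in a single answer.
confident = [0.95, 0.05]   # little internal conflict
conflicted = [0.50, 0.50]  # maximal internal conflict
print(entropy(confident) < entropy(conflicted))
```

Under this proxy, higher answer entropy signals a stronger intra-memory conflict for the queried fact.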
Context versus Prior Knowledge in Language Models
Du, Kevin; Snæbjarnarson, Vésteinn; Stoehr, Niklas; White, Jennifer C.; Schein, Aaron; Cotterell, Ryan
To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.
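The two metrics described above can be sketched with a simple divergence-based proxy. The paper defines them via mutual information over sampled contexts; a minimal, assumed stand-in is to measure how far a given context shifts the model's answer distribution away from its prior (all names here are illustrative, not the authors' implementation):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def persuasion_proxy(prior, posterior):
    """Proxy for the persuasion score of one context: how far the
    context-conditioned answer distribution moves from the prior."""
    return kl_divergence(posterior, prior)

def susceptibility_proxy(prior, posteriors):
    """Proxy for an entity's susceptibility: the average shift of its
    answer distribution across a collection of contexts."""
    return sum(persuasion_proxy(prior, post) for post in posteriors) / len(posteriors)

# A familiar entity has a sharp prior that barely moves under most
# contexts (low susceptibility); a context that flips the answer
# distribution receives a high persuasion score.
prior = [0.9, 0.1]
print(persuasion_proxy(prior, [0.1, 0.9]))
```

An unchanged distribution yields a score of zero, so higher values indicate contexts that sway the model more, mirroring the intended reading of the two scores.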